Gemma 4
Introducing Gemma 4 12B
Novel unified architecture: No separate multimodal encoders. Vision and audio inputs flow directly into the LLM backbone.
Advanced reasoning: Benchmark performance nearing our 26B model, unlocking powerful multi-step reasoning and agentic workflows.
Laptop ready: Small enough to run locally with just 16 GB of VRAM or unified memory.
Open and accessible: Released under an Apache 2.0 license with support across the developer ecosystem.
Drafter-ready: Gemma 4 12B comes equipped with Multi-Token Prediction (MTP) drafters to reduce latency.
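
To put the "Laptop ready" figure in context, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at a few common precisions. It assumes exactly 12 billion parameters (taken from the "12B" in the name; the true count may differ) and counts weight storage only, ignoring the KV cache, activations, and runtime overhead. The announcement does not say which precision the 16 GB figure assumes, so the precisions listed are illustrative assumptions, not the published configuration.

```python
# Rough weight-only memory estimate for a 12B-parameter model.
# Assumption: exactly 12e9 parameters, inferred from the "12B" name.
PARAMS = 12e9

# Memory budget quoted in the announcement, in decimal gigabytes.
BUDGET_GB = 16

# Bytes per parameter for a few common precisions (illustrative choices,
# not a statement about how the model is actually shipped).
BYTES_PER_PARAM = {
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    verdict = "fits" if gb <= BUDGET_GB else "does not fit"
    print(f"{name}: {gb:.1f} GB of weights, {verdict} in {BUDGET_GB} GB")
```

Under these assumptions, 16-bit weights alone would exceed 16 GB, while 8-bit and 4-bit weights would leave headroom for the cache and activations. Actual requirements depend on context length and the runtime used.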
ollama
gemma4